- Toggle-able thinking mode from settings
- Fixed race issue in model loading/cancelling
- UTF-8 parser now matches the llamacpp one
- Fixed error in VLM where the camera stayed on even when you exit that section
- Default sys prompt when loading
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the review settings.
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info: Configuration used: defaults · Review profile: CHILL · Plan: Pro · Files selected for processing: 1
📝 Walkthrough

Adds end-to-end "thinking" support: model metadata/registration changes, persisted thinking-mode settings and UI, extraction/handling of embedded content across generation and voice flows, streaming cancellation and UTF‑8 robustness in native backends, plus related UI, tooling, and utility updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant ChatUI as Chat Interface
    participant Settings as SettingsViewModel
    participant LLMVM as LLMViewModel
    participant LLMSDK as RunAnywhere LLM SDK
    participant CppBackend as C++ Backend
    User->>ChatUI: Toggle Thinking Mode
    ChatUI->>Settings: set thinkingModeEnabled
    Settings->>Settings: persist
    User->>ChatUI: Send prompt
    ChatUI->>LLMVM: startGeneration(prompt)
    LLMVM->>Settings: read thinkingModeEnabled
    LLMVM->>LLMVM: check loadedModelSupportsThinking
    alt thinking enabled & supported
        LLMVM->>LLMSDK: generate(with thinking prefix)
    else
        LLMVM->>LLMSDK: generate(original prompt)
    end
    LLMSDK->>CppBackend: call C++ generate (streaming)
    CppBackend-->>LLMSDK: stream tokens (may include <think> tags)
    LLMSDK->>LLMVM: LLMGenerationResult(text, thinkingContent)
    LLMVM->>ChatUI: update message with stripped display text
```
```mermaid
sequenceDiagram
    participant User
    participant VoiceUI as Voice Assistant
    participant VoiceVM as VoiceAgentViewModel
    participant VoiceSession as Voice Session (SDK)
    participant LLMSDK as RunAnywhere LLM SDK
    participant TTSSDK as TTS SDK
    User->>VoiceUI: Tap mic
    VoiceUI->>VoiceVM: micAction()
    alt currently speaking
        VoiceVM->>VoiceSession: interruptPlayback()
    else
        VoiceVM->>VoiceSession: startConversation()
    end
    VoiceSession->>LLMSDK: generate(effectivePrompt considering /no_think)
    LLMSDK-->>VoiceSession: result (text, thinkingContent)
    VoiceSession->>TTSSDK: synthesize(cleanedResponse)
    TTSSDK-->>VoiceSession: audio
    VoiceSession->>VoiceUI: play audio / emit events with thinkingContent
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ❌ 3 failed checks (2 warnings, 1 inconclusive)
```swift
if !config.thinkingModeEnabled {
    effectivePrompt = "/no_think\n\(transcription)"
} else {
    effectivePrompt = transcription
}
```
/no_think prefix added for non-thinking models
VoiceSessionConfig.thinkingModeEnabled is set to settings.loadedModelSupportsThinking && settings.thinkingModeEnabled in VoiceAgentViewModel. This collapses two independent flags into one boolean, so the session can no longer distinguish "thinking mode disabled for a model that supports thinking" from "model does not support thinking at all."
As a result, every model that doesn't have thinking mode on (including Llama, Phi, etc. that never support thinking) will have /no_think\n prepended to every single voice prompt. This is inconsistent with how LLMViewModel.applyThinkingModePrefix handles the same scenario:
```swift
// LLMViewModel — correct: skips prefix entirely for non-thinking models
guard loadedModelSupportsThinking else { return prompt }
```

The fix is to add a separate `modelSupportsThinking: Bool` field to `VoiceSessionConfig` and mirror the guard pattern used in `LLMViewModel`:
```swift
if config.modelSupportsThinking && !config.thinkingModeEnabled {
    effectivePrompt = "/no_think\n\(transcription)"
} else {
    effectivePrompt = transcription
}
```
```swift
Task {
    await LLM.shared.destroy()
    await STT.shared.destroy()
    await TTS.shared.destroy()
    await VAD.shared.destroy()
    await VoiceAgent.shared.destroy()
    await VLM.shared.destroy()
}
```
Swift destroy() tasks race with C shutdown
The Task { await … .destroy() } is fired-and-forgotten — it is not awaited before the synchronous C-level shutdown (Telemetry.shutdown(), Events.unregister(), etc.) continues. If the destroy() implementations reach into the C layer (e.g. to release handles), they will do so concurrently with or even after the underlying C objects have already been freed, creating a use-after-free.
Either await all destroy calls before proceeding, or ensure the C shutdown is run at the end of the same task:
```swift
Task {
    await LLM.shared.destroy()
    await STT.shared.destroy()
    await TTS.shared.destroy()
    await VAD.shared.destroy()
    await VoiceAgent.shared.destroy()
    await VLM.shared.destroy()
    // Now safe to run C-level shutdown
    Telemetry.shutdown()
    Events.unregister()
    ...
}
```
```swift
thinkingTokens: thinkingContent.map { _ in outputTokens } ?? 0,
responseTokens: outputTokens
```
thinkingTokens incorrectly set to total output tokens
thinkingContent.map { _ in outputTokens } sets thinkingTokens to the entire outputTokens value whenever thinking content is present. But outputTokens already represents the total completion tokens (thinking + response combined). Setting thinkingTokens = outputTokens and responseTokens = outputTokens means both fields report the same total, which double-counts every token and inflates any downstream metrics that sum the two.
The thinking token count cannot be known exactly without tracking it during generation (e.g. counting the tokens inside the <think> block), but at a minimum the value should not exceed outputTokens. Consider leaving it as 0 until a reliable count is available, or computing a rough estimate from thinkingContent length:
```swift
thinkingTokens: thinkingContent.map { _ in 0 } ?? 0,
```
```swift
emit(.responded(text: cleanedResponse, thinkingContent: thinkingContent))

// Step 4: Synthesize speech from cleaned response (no think tags spoken)
```
Missing Step 3 in numbered comment sequence
The inline comments jump from Step 2 to Step 4 with no Step 3, which is confusing for anyone reading through the pipeline:
```
// Step 1: Transcribe audio
// Step 2: Generate LLM response
// Step 4: Synthesize speech   ← should be Step 3
```
```diff
- // Step 4: Synthesize speech from cleaned response (no think tags spoken)
+ // Step 3: Synthesize speech from cleaned response (no think tags spoken)
```
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift (1)
208-214: ⚠️ Potential issue | 🟡 Minor

**`buildFollowUpPrompt` loses `/no_think` prefix positioning in follow-up iterations.**

The initial prompt correctly positions `/no_think\n` at the beginning (lines 179-181). However, `buildFollowUpPrompt` receives the original `prompt` parameter and embeds it in the middle of a larger string:

```swift
"User: \(prompt)" // becomes: "User: /no_think\noriginal question..."
```

If the C++ inference layer requires `/no_think` at the prompt start, tool-calling loops after the first iteration will not suppress thinking correctly. Fix: extract and reapply the `/no_think` prefix in `buildFollowUpPrompt`, or pass `cleanPrompt` to the function instead of `prompt`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere`+ToolCalling.swift around lines 208 - 214, The follow-up prompt builder (buildFollowUpPrompt) is embedding the original prompt which may contain the `/no_think` prefix in the middle of the assembled message, so the C++ layer loses the required leading `/no_think`. Fix by ensuring the leading `/no_think` is preserved: either strip and capture the `/no_think` prefix from prompt into cleanPrompt before calling buildFollowUpPrompt and pass cleanPrompt instead of prompt, or modify buildFollowUpPrompt to detect and reapply the `/no_think` prefix at the very start of the returned string; update the call that sets fullPrompt to pass cleanPrompt (or reapply prefix) and keep references to buildFollowUpPrompt, fullPrompt, prompt, and cleanPrompt consistent.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift (1)
417-421: ⚠️ Potential issue | 🟡 Minor

**Keep a stable scroll target while streaming.**

After the first visible token arrives, the `"typing"` view disappears, but `scrollToBottom(proxy:animated:)` later in this file still targets `"typing"` whenever `isGenerating` is true. That leaves focus/keyboard-triggered scrolls pointing at an ID that no longer exists mid-generation. Use the last message ID once content is non-empty, or keep a persistent bottom anchor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift` around lines 417 - 421, The scrolling target "typing" disappears once the first token arrives, so change scroll target logic used by scrollToBottom(proxy:animated:) to choose a stable anchor: when viewModel.isGenerating is true but viewModel.messages.last?.content.isEmpty == false, use the last message's id (e.g., viewModel.messages.last?.id) instead of "typing", or create a persistent bottom anchor view ID (e.g., "bottom-anchor") and always scroll to that; update the TypingIndicatorView id usage (id("typing")) and the code that calls scrollToBottom(proxy:animated:) to select the last message ID when content is non-empty or fall back to the persistent "bottom-anchor" so scroll targets never point to a removed view.
🧹 Nitpick comments (6)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift (1)
136-136: Consider providing feedback when a load is already in progress.

The silent early return prevents race conditions, but users may be confused if they tap a model and nothing visibly happens. Consider either:

- Setting `isLoading = true` at the start of `selectModel` (similar to `loadModelsFromRegistry`), so the UI can show a loading indicator.
- Logging or providing minimal feedback that a load is already in progress.
This is minor since the UI may already handle this elsewhere.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift` at line 136, The early return in selectModel guarded by isLoadingModel makes taps silently no-op; update selectModel (in ModelListViewModel) to give feedback by either setting the same loading state used in loadModelsFromRegistry (e.g., set isLoading = true / isLoadingModel = true at the start of selectModel) so the UI shows a spinner, or log/emit a minimal message when isLoadingModel is true to indicate a load is in progress; ensure you clear the loading flag on completion/failure just like loadModelsFromRegistry to avoid leaving the view stuck.sdk/runanywhere-commons/src/features/llm/llm_component.cpp (1)
772-788: Lock-free cancel is intentional but has a subtle race window.

The design correctly avoids taking the mutex so that `cancel` can interrupt an in-flight generation that holds the lock. However, accessing `component->lifecycle` (line 780) without the mutex could race with `rac_llm_component_destroy` or `rac_llm_component_unload` if called concurrently. This is likely acceptable since calling `cancel()` during component destruction would be a usage error, but consider documenting this constraint in the header or adding an assertion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/features/llm/llm_component.cpp` around lines 772 - 788, The cancel implementation reads component->lifecycle without holding the component mutex which can race with destruction/unload; to fix, either document the threading constraint in the public header (declare that rac_llm_component_cancel must not be called concurrently with rac_llm_component_destroy or rac_llm_component_unload) or add a defensive check/assert at the top of rac_llm_component_cancel (e.g., assert(component->lifecycle && "cannot cancel during destroy/unload") or return an error if lifecycle is null) before calling rac_lifecycle_get_service and rac_llm_cancel, referencing rac_llm_component_cancel, component->lifecycle, rac_llm_component_destroy, rac_llm_component_unload and rac_llm_cancel.sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp (1)
983-1041: Consider extracting duplicated UTF-8 handling logic.

The UTF-8 validation loop, `scanner_state` usage, and safe-stop trimming are nearly identical to `generate_stream` (lines 682-752). Additionally, `STOP_SEQUENCES` and `MAX_STOP_LEN` are defined twice (lines 665-674 and 966-975).
♻️ Suggested approach
Consider extracting to a file-level helper or inner class:
```cpp
namespace {
static const std::vector<std::string> kStopSequences = {
    "<|im_end|>", "<|eot_id|>", "</s>", "<|end|>", "<|endoftext|>",
    "\n\nUser:", "\n\nHuman:",
};
static const size_t kMaxStopLen = []{
    size_t m = 0;
    for (const auto& s : kStopSequences) m = std::max(m, s.size());
    return m;
}();
} // namespace
```

The UTF-8 buffering and stop-sequence detection could be encapsulated in a small struct that both generation paths use.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp` around lines 983 - 1041, Extract the duplicated UTF-8 buffering and stop-sequence logic into a shared helper (e.g., a static anonymous-namespace constants kStopSequences and kMaxStopLen and a small Utf8StopDetector/Utf8StopBuffer struct) and replace the inline code in both the streaming loop and generate_stream with calls to that helper; the helper should own Utf8State, partial buffer, stop_window and expose methods to feed new bytes (from common_token_to_piece/new_token_chars), return any fully-valid UTF‑8 chunk, detect/trim stop sequences (mirroring the current found_stop_pos logic), and produce safe trimmed output for appending to generated_text so you can remove duplicated STOP_SEQUENCES, MAX_STOP_LEN, scanner_state, partial_utf8_buffer and stop_window usages in the loops.examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift (1)
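To illustrate what the suggested shared struct might look like, here is a minimal sketch of the hold-back-incomplete-tail idea both generation paths need. The name `Utf8ChunkBuffer` and every detail below are hypothetical, not code from this PR:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: buffers raw token bytes and releases only
// complete UTF-8 sequences, holding back a trailing partial character.
struct Utf8ChunkBuffer {
    std::string pending;

    // Byte length of a UTF-8 sequence starting with `lead`,
    // or 0 if `lead` is a continuation/invalid leading byte.
    static size_t seq_len(unsigned char lead) {
        if (lead < 0x80) return 1;          // ASCII
        if ((lead >> 5) == 0x6) return 2;   // 110xxxxx
        if ((lead >> 4) == 0xE) return 3;   // 1110xxxx
        if ((lead >> 3) == 0x1E) return 4;  // 11110xxx
        return 0;
    }

    // Append raw bytes; returns the longest prefix that ends on a
    // complete UTF-8 character. Incomplete tail bytes stay buffered.
    std::string feed(const std::string& bytes) {
        pending += bytes;
        size_t complete = 0;  // bytes known to end on a boundary
        size_t i = 0;
        while (i < pending.size()) {
            size_t n = seq_len(static_cast<unsigned char>(pending[i]));
            if (n == 0) { complete = ++i; continue; }  // pass invalid byte through
            if (i + n > pending.size()) break;         // incomplete tail
            i += n;
            complete = i;
        }
        std::string out = pending.substr(0, complete);
        pending.erase(0, complete);
        return out;
    }
};
```

A production version would also validate continuation bytes and flush any stale `pending` bytes at end of stream; this sketch only shows the chunk-boundary buffering that both code paths would share.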
103-103: Remove debug print statement.

This `print` call runs in release builds too; route it through `os.Logger` gated to debug builds only, or remove it entirely.

🧹 Proposed fix

```diff
- print("Calculator received args: \(args), using expression: '\(expression)'")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift` at line 103, Remove the debug print statement that logs "Calculator received args" in ToolSettingsView.swift; replace it either by removing the line entirely or by switching to a proper logger (os.Logger) and gating it behind a debug-only build check (e.g., `#if` DEBUG) so it does not emit to production consoles; locate the occurrence by searching for the string "Calculator received args" or the print(...) call and update the surrounding function accordingly.examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift (1)
446-450: Consider using named parameters for enum destructuring.

The positional destructuring works correctly, but if `VoiceSessionEvent` associated value order changes, this code would silently receive misaligned data. Using named parameters would be more resilient:

🧹 Suggested improvement

```diff
- case .responded(let text, _): assistantResponse = text
+ case .responded(text: let text, thinkingContent: _): assistantResponse = text
  case .speaking: sessionState = .speaking; currentStatus = "Speaking..."
- case let .turnCompleted(transcript, response, _, _):
+ case .turnCompleted(transcript: let transcript, response: let response, thinkingContent: _, audio: _):
  currentTranscript = transcript; assistantResponse = response
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift` around lines 446 - 450, The pattern match on VoiceSessionEvent uses positional associated values which is fragile; update the case patterns to use named parameters so each value is matched by label (e.g., change the case .turnCompleted(transcript, response, _, _) to a labeled form like case let .turnCompleted(transcript: transcript, response: response, ...)) and similarly label .responded and other cases if they have associated value labels; then assign currentTranscript, assistantResponse, sessionState, and currentStatus from those named bindings to ensure order changes in VoiceSessionEvent won’t break the logic.examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift (1)
101-127: Prefer Swift-native notification observation here.

This reintroduces selector-based observer management inside a Combine-backed `@MainActor` view model. Using a `NotificationCenter` publisher or async notifications would keep delivery/cancellation in the same Swift concurrency model and remove the manual `removeObserver` path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift` around lines 101 - 127, The view model is using selector-based NotificationCenter observers (subscribeToModelNotifications, handleModelLoaded(_:), handleModelUnloaded, and deinit removeObserver) which reintroduces non-Swift-concurrency observer management; replace them with Swift-native observation by removing addObserver/removeObserver and instead subscribe via NotificationCenter publishers or async notifications and store the subscriptions in a Combine cancellables Set (e.g., a private var cancellables = Set<AnyCancellable>() or use Task/NotificationCenter.notifications(named:) with Task cancellation). Specifically, remove the selector-based registrations in subscribeToModelNotifications, replace them with NotificationCenter.default.publisher(for: Notification.Name("ModelLoaded")) and .sink to update loadedModelSupportsThinking (and a separate publisher for "ModelUnloaded" to set it false), store the AnyCancellable values in cancellables, and drop the manual removeObserver call in deinit so cancellations happen via the cancellables lifecycle (or cancel the Tasks in deinit if using async sequences).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel`+Generation.swift:
- Around line 25-29: The loop is issuing UI updates and notifications on every
token even when Self.stripThinkTags(from:) yields the same displayText; change
the streaming loop (where fullResponse, displayText are computed and
updateMessageContent(at:content:) and NotificationCenter.default.post are
called) to track the last-displayed text (e.g., a local prevDisplayText) and
only call updateMessageContent(at:content:) and post the scroll notification
when displayText != prevDisplayText, updating prevDisplayText after a successful
update; this avoids no-op main-thread churn while still showing visible changes.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`:
- Around line 34-35: Remove the duplicate `@AppStorage`("thinkingModeEnabled")
property in ChatInterfaceView and use the shared
Settings/ToolSettingsViewModel's thinkingModeEnabled instead; replace any uses
or bindings of the local thinkingModeEnabled with
toolSettingsViewModel.thinkingModeEnabled (including the badge binding and other
occurrences mentioned around the 502-518 region) so toggles update
SettingsViewModel.shared and avoid the second source of truth.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`:
- Around line 122-128: The NSExpression(format:) call can raise an Objective-C
exception for malformed syntax (e.g., "((", "1 + + 2"), so replace the direct
NSExpression(format: cleanedExpression) usage with a safe evaluator: implement
an ObjC bridging helper (e.g., evaluateExpressionSafely(_:)) that wraps
NSExpression(format:) and expressionValue(with:context:) inside `@try/`@catch and
returns nil on exception; then call that helper from ToolSettingsView (use
cleanedExpression and expression variables) and only build the return dictionary
when the helper returns a valid NSNumber, otherwise handle it as an invalid
expression (e.g., skip returning a result or surface an error).
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Vision/VLMCameraView.swift`:
- Around line 56-63: The scenePhase handler can restart the camera even when
this view is offscreen—add a visibility guard: introduce a Bool property (e.g.,
isVisible) updated in onAppear and onDisappear (or via SwiftUI .task/.onChange
for view visibility), and in the .onChange(of: scenePhase) closure check
isVisible before calling setupCameraIfNeeded(); keep the existing calls to
viewModel.stopAutoStreaming() and viewModel.stopCamera() when moving to
background/inactive, but avoid calling setupCameraIfNeeded() on .active unless
isVisible is true.
In `@sdk/runanywhere-commons/src/features/llm/tool_calling.cpp`:
- Around line 718-722: The fallback branch currently always quotes the scalar in
flat_args, turning raw JSON literals into strings; update the logic in
tool_calling.cpp (the branch using kval_is_obj, kval, escaped_key,
escape_json_string, flat_args) to detect actual JSON literals and emit them
unquoted. Add or call a helper (e.g., is_json_literal(kval)) that returns true
for numbers, true/false, null, and for array/object text (or reuse kval_is_obj
for structures), and only run escape_json_string and wrap in quotes when the
value is a real string; otherwise append the raw kval as-is to preserve typed
literals and follow the structured-types guideline. Ensure the helper is
documented/implemented in the same translation unit and used in the same
conditional chain that currently handles kval and kval_is_obj.
- Around line 401-425: The scalar-value branch currently treats a leading '[' as
a scalar and stops at the first ',' or ']', truncating arrays; modify the
scanner that assigns *out_value and *out_is_object so that if json_obj[pos] ==
'[' you perform a matching-bracket scan (similar to how '{' is handled) to find
the closing ']' including nested brackets and strings, allocate the full
substring (including brackets), set *out_is_object = true (or the same
structured flag used for objects), and return true; otherwise keep the existing
scalar logic for true scalars. Ensure you use the same allocation/memory
conventions as the existing object path and reuse variable names json_obj, pos,
val_start/val_end, *out_value and *out_is_object to locate the change.
- Around line 705-711: The loop is currently marking is_tool_key true for any
alias in TOOL_NAME_KEYS and skipping all such keys, causing legitimate args like
"name" to be dropped; modify the logic in tool_calling.cpp (around the loop over
TOOL_NAME_KEYS) to record which specific alias matched (e.g., matched_key or
matched_index) when comparing k against TOOL_NAME_KEYS, then after the loop only
skip the map entry if k equals that matched key (or index matches matched_index)
instead of skipping whenever any alias exists; update any uses of is_tool_key to
use the matched_key check so only the actual tool-name key is excluded.
In `@sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift`:
- Around line 182-189: The shutdown() function currently spawns a
fire-and-forget Task which returns before native components finish tearing down;
change shutdown() to be async (remove the Task wrapper), sequentially await each
component's destroy() call (LLM.shared.destroy(), STT.shared.destroy(),
TTS.shared.destroy(), VAD.shared.destroy(), VoiceAgent.shared.destroy(),
VLM.shared.destroy()) so teardown completes before proceeding, and only after
all awaits clear _isInitialized and _servicesInitialized; then update the caller
RunAnywhere.reset() to await CppBridge.shutdown() and mark reset() async.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere`+TextGeneration.swift:
- Around line 100-110: The LLMGenerationResult currently misreports
thinkingTokens (non-streaming sets it to outputTokens when any thinkingContent
exists, while streaming returns 0), causing inconsistent and double-counted
metrics; change both code paths that construct LLMGenerationResult (the
non-streaming return here and the similar block around the 450-460 range) to set
thinkingTokens to 0 (and ensure responseTokens/outputTokens remain unchanged)
until the C++ layer exposes a real thinking-token metric, so both streaming and
non-streaming results return a consistent zero-valued thinkingTokens.
- Around line 312-335: The current ThinkingContentParser.extract(from:) only
removes the first <think>...</think> pair; update extract to scan the entire
input (e.g., with a while loop or regex) to find and remove all complete
<think>...</think> blocks, concatenating their inner contents (trimmed) into a
single thinking string, and rebuild response text from the remaining parts; also
strip any stray "<think>" or "</think>" leftovers after extraction so incomplete
fragments do not appear in the returned text. Ensure you modify
ThinkingContentParser.extract to collect all matches, remove them from the
source, and return (text: cleanedResponseText, thinking: combinedThinkingOrNil).
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere`+TTS.swift:
- Around line 189-190: TTSStreamContext currently marked `@unchecked` Sendable
with an unprotected mutable totalData that is mutated from the C callback;
change TTSStreamContext to implement Sendable safely by adding an
OSAllocatedUnfairLock (matching the STT bridge pattern) and use it to guard all
accesses/mutations of totalData (wrap mutations in the C callback and the read
in the finalization path where
Unmanaged<TTSStreamContext>.fromOpaque(...).takeRetainedValue() is used). Mirror
the STT implementation (use the same lock property name and lock/unlock
placement) so totalData is only read/written while holding the
OSAllocatedUnfairLock.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift:
- Around line 224-229: The logic for adding the "/no_think" prefix in
RunAnywhere+VoiceSession.swift currently uses only config.thinkingModeEnabled;
align it with the pattern in RAGViewModel by ensuring the decision also
considers whether the model supports thinking (i.e., the same combined condition
used when building the config elsewhere). Update the construction or usage so
that effectivePrompt is prefixed with "/no_think" only when thinkingMode is
disabled AND the model supports thinking was considered when the config was
created (mirror the check used in RAGViewModel and VoiceAgentViewModel where
thinkingModeEnabled is set as settings.loadedModelSupportsThinking &&
settings.thinkingModeEnabled).
- Around line 258-263: The snippet currently emits .turnCompleted with variables
transcription, cleanedResponse, thinkingContent, and synthesizedAudio even when
an error occurs in the preceding do block; change the control flow so
.turnCompleted is only emitted when the do block completes successfully (i.e.,
move the emit(.turnCompleted(...)) into the success path of the do block or
guard for a successful result before emitting), and ensure the catch path either
emits an error-specific event (or returns/throws) instead of fallthrough
emission; reference the existing symbols transcription, cleanedResponse,
thinkingContent, synthesizedAudio and the emit(.turnCompleted(...)) call when
making the change.
---
Outside diff comments:
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`:
- Around line 417-421: The scrolling target "typing" disappears once the first
token arrives, so change scroll target logic used by
scrollToBottom(proxy:animated:) to choose a stable anchor: when
viewModel.isGenerating is true but viewModel.messages.last?.content.isEmpty ==
false, use the last message's id (e.g., viewModel.messages.last?.id) instead of
"typing", or create a persistent bottom anchor view ID (e.g., "bottom-anchor")
and always scroll to that; update the TypingIndicatorView id usage
(id("typing")) and the code that calls scrollToBottom(proxy:animated:) to select
the last message ID when content is non-empty or fall back to the persistent
"bottom-anchor" so scroll targets never point to a removed view.
In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift`:
- Around line 208-214: The follow-up prompt builder (buildFollowUpPrompt) is
embedding the original prompt which may contain the `/no_think` prefix in the
middle of the assembled message, so the C++ layer loses the required leading
`/no_think`. Fix by ensuring the leading `/no_think` is preserved: either strip
and capture the `/no_think` prefix from prompt into cleanPrompt before calling
buildFollowUpPrompt and pass cleanPrompt instead of prompt, or modify
buildFollowUpPrompt to detect and reapply the `/no_think` prefix at the very
start of the returned string; update the call that sets fullPrompt to pass
cleanPrompt (or reapply prefix) and keep references to buildFollowUpPrompt,
fullPrompt, prompt, and cleanPrompt consistent.
---
Nitpick comments:
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift`:
- Line 136: The early return in selectModel guarded by isLoadingModel makes taps
silently no-op; update selectModel (in ModelListViewModel) to give feedback by
either setting the same loading state used in loadModelsFromRegistry (e.g., set
isLoading = true / isLoadingModel = true at the start of selectModel) so the UI
shows a spinner, or log/emit a minimal message when isLoadingModel is true to
indicate a load is in progress; ensure you clear the loading flag on
completion/failure just like loadModelsFromRegistry to avoid leaving the view
stuck.
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`:
- Around line 101-127: The view model is using selector-based NotificationCenter
observers (subscribeToModelNotifications, handleModelLoaded(_:),
handleModelUnloaded, and deinit removeObserver) which reintroduces
non-Swift-concurrency observer management; replace them with Swift-native
observation by removing addObserver/removeObserver and instead subscribe via
NotificationCenter publishers or async notifications and store the subscriptions
in a Combine cancellables Set (e.g., a private var cancellables =
Set<AnyCancellable>() or use Task/NotificationCenter.notifications(named:) with
Task cancellation). Specifically, remove the selector-based registrations in
subscribeToModelNotifications, replace them with
NotificationCenter.default.publisher(for: Notification.Name("ModelLoaded")) and
.sink to update loadedModelSupportsThinking (and a separate publisher for
"ModelUnloaded" to set it false), store the AnyCancellable values in
cancellables, and drop the manual removeObserver call in deinit so cancellations
happen via the cancellables lifecycle (or cancel the Tasks in deinit if using
async sequences).
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`:
- Line 103: Remove the debug print statement that logs "Calculator received
args" in ToolSettingsView.swift; replace it either by removing the line entirely
or by switching to a proper logger (os.Logger) and gating it behind a debug-only
build check (e.g., #if DEBUG) so it does not emit to production consoles; locate
the occurrence by searching for the string "Calculator received args" or the
print(...) call and update the surrounding function accordingly.
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift`:
- Around line 446-450: The pattern match on VoiceSessionEvent uses positional
associated values which is fragile; update the case patterns to use named
parameters so each value is matched by label (e.g., change the case
.turnCompleted(transcript, response, _, _) to a labeled form like case let
.turnCompleted(transcript: transcript, response: response, ...)) and similarly
label .responded and other cases if they have associated value labels; then
assign currentTranscript, assistantResponse, sessionState, and currentStatus
from those named bindings to ensure order changes in VoiceSessionEvent won’t
break the logic.
In `sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp`:
- Around line 983-1041: Extract the duplicated UTF-8 buffering and stop-sequence
logic into a shared helper (e.g., static anonymous-namespace constants
kStopSequences and kMaxStopLen plus a small Utf8StopDetector/Utf8StopBuffer
struct) and replace the inline code in both the streaming loop and
generate_stream with calls to that helper; the helper should own Utf8State,
partial buffer, stop_window and expose methods to feed new bytes (from
common_token_to_piece/new_token_chars), return any fully-valid UTF‑8 chunk,
detect/trim stop sequences (mirroring the current found_stop_pos logic), and
produce safe trimmed output for appending to generated_text so you can remove
duplicated STOP_SEQUENCES, MAX_STOP_LEN, scanner_state, partial_utf8_buffer and
stop_window usages in the loops.
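The refactor described above can be sketched roughly as follows. This is an illustrative stand-in, not the backend's actual code: the struct and method names (Utf8StopBuffer, feed, hit_stop) are invented here, the UTF-8 handling is simplified to lead-byte lengths without continuation-byte validation, and trimming the stop text out of generated_text is omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical shared helper: buffers incomplete UTF-8 sequences and watches
// emitted text for stop sequences, so both streaming paths can reuse it.
struct Utf8StopBuffer {
    std::string partial;               // bytes of a trailing incomplete sequence
    std::string window;                // emitted text, scanned for stop sequences
    std::vector<std::string> stops;

    explicit Utf8StopBuffer(std::vector<std::string> stop_seqs)
        : stops(std::move(stop_seqs)) {}

    // Expected byte length of a UTF-8 sequence from its lead byte.
    static size_t seq_len(unsigned char b) {
        if (b < 0x80) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        return 0;                      // continuation or invalid lead byte
    }

    // Feed raw bytes from the tokenizer; returns the prefix made of complete
    // UTF-8 sequences, holding back a trailing partial sequence for later.
    std::string feed(const std::string& bytes) {
        partial += bytes;
        size_t emit_end = 0;
        while (emit_end < partial.size()) {
            size_t n = seq_len(static_cast<unsigned char>(partial[emit_end]));
            if (n == 0) { emit_end++; continue; }     // stray byte: pass through
            if (emit_end + n > partial.size()) break; // incomplete: hold back
            emit_end += n;
        }
        std::string out = partial.substr(0, emit_end);
        partial.erase(0, emit_end);
        window += out;
        return out;
    }

    // True once any stop sequence has appeared in the emitted text.
    bool hit_stop() const {
        for (const auto& s : stops)
            if (window.find(s) != std::string::npos) return true;
        return false;
    }
};
```

In this shape, both the streaming loop and generate_stream would construct one instance up front and call feed/hit_stop per decoded token instead of duplicating the scanner state.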
In `sdk/runanywhere-commons/src/features/llm/llm_component.cpp`:
- Around line 772-788: The cancel implementation reads component->lifecycle
without holding the component mutex which can race with destruction/unload; to
fix, either document the threading constraint in the public header (declare that
rac_llm_component_cancel must not be called concurrently with
rac_llm_component_destroy or rac_llm_component_unload) or add a defensive
check/assert at the top of rac_llm_component_cancel (e.g.,
assert(component->lifecycle && "cannot cancel during destroy/unload") or return
an error if lifecycle is null) before calling rac_lifecycle_get_service and
rac_llm_cancel, referencing rac_llm_component_cancel, component->lifecycle,
rac_llm_component_destroy, rac_llm_component_unload and rac_llm_cancel.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4f15418f-6046-4e3d-9175-cb636f3efb2f
📒 Files selected for processing (27)
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Events.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Generation.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+ModelManagement.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+ToolCalling.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/ViewModels/RAGViewModel.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/CombinedSettingsView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Vision/VLMCameraView.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift
sdk/runanywhere-commons/src/backends/llamacpp/llamacpp_backend.cpp
sdk/runanywhere-commons/src/features/llm/llm_component.cpp
sdk/runanywhere-commons/src/features/llm/tool_calling.cpp
sdk/runanywhere-swift/Sources/RunAnywhere/Features/TTS/Services/AudioPlaybackManager.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Foundation/Bridge/CppBridge.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Infrastructure/Download/Utilities/ArchiveUtility.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+ToolCalling.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceAgent.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/VoiceAgentTypes.swift
for try await token in stream {
    fullResponse += token
    await updateMessageContent(at: messageIndex, content: fullResponse)
    let displayText = Self.stripThinkTags(from: fullResponse)
    await updateMessageContent(at: messageIndex, content: displayText)
    NotificationCenter.default.post(
Skip no-op UI updates while the model is still inside <think>.
stripThinkTags can keep displayText unchanged for many hidden tokens, but this loop still rebuilds the message and posts a scroll notification on every token. On long thinking traces that becomes avoidable main-thread churn.
Suggested simplification
func generateStreamingResponse(
prompt: String,
options: LLMGenerationOptions,
messageIndex: Int
) async throws {
var fullResponse = ""
+ var lastDisplayText = ""
let streamingResult = try await RunAnywhere.generateStream(prompt, options: options)
let stream = streamingResult.stream
let metricsTask = streamingResult.result
for try await token in stream {
fullResponse += token
let displayText = Self.stripThinkTags(from: fullResponse)
+ guard displayText != lastDisplayText else { continue }
+ lastDisplayText = displayText
await updateMessageContent(at: messageIndex, content: displayText)
NotificationCenter.default.post(
name: Notification.Name("MessageContentUpdated"),
object: nil
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Current:
for try await token in stream {
    fullResponse += token
    await updateMessageContent(at: messageIndex, content: fullResponse)
    let displayText = Self.stripThinkTags(from: fullResponse)
    await updateMessageContent(at: messageIndex, content: displayText)
    NotificationCenter.default.post(
Suggested:
var fullResponse = ""
var lastDisplayText = ""
let streamingResult = try await RunAnywhere.generateStream(prompt, options: options)
let stream = streamingResult.stream
let metricsTask = streamingResult.result
for try await token in stream {
    fullResponse += token
    let displayText = Self.stripThinkTags(from: fullResponse)
    guard displayText != lastDisplayText else { continue }
    lastDisplayText = displayText
    await updateMessageContent(at: messageIndex, content: displayText)
    NotificationCenter.default.post(
        name: Notification.Name("MessageContentUpdated"),
        object: nil
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/ViewModels/LLMViewModel+Generation.swift`
around lines 25 - 29, The loop is issuing UI updates and notifications on every
token even when Self.stripThinkTags(from:) yields the same displayText; change
the streaming loop (where fullResponse, displayText are computed and
updateMessageContent(at:content:) and NotificationCenter.default.post are
called) to track the last-displayed text (e.g., a local prevDisplayText) and
only call updateMessageContent(at:content:) and post the scroll notification
when displayText != prevDisplayText, updating prevDisplayText after a successful
update; this avoids no-op main-thread churn while still showing visible changes.
@ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
@AppStorage("thinkingModeEnabled") private var thinkingModeEnabled = false
Avoid a second source of truth for thinking mode.
SettingsViewModel already owns thinkingModeEnabled and persists it. Toggling a separate @AppStorage copy here can update UserDefaults without updating SettingsViewModel.shared.thinkingModeEnabled, so generation code that reads the shared settings can stay stale for the rest of the session. Bind the badge to the shared settings view model instead of duplicating the state.
Suggested direction
- @ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
- @AppStorage("thinkingModeEnabled") private var thinkingModeEnabled = false
+ @ObservedObject private var toolSettingsViewModel = ToolSettingsViewModel.shared
+ @ObservedObject private var settingsViewModel = SettingsViewModel.shared
...
- if thinkingModeEnabled && viewModel.loadedModelSupportsThinking {
+ if settingsViewModel.thinkingModeEnabled && viewModel.loadedModelSupportsThinking {
thinkingModeBadge
}
...
- .padding(.top, ((thinkingModeEnabled && viewModel.loadedModelSupportsThinking) || viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0)
+ .padding(.top, ((settingsViewModel.thinkingModeEnabled && viewModel.loadedModelSupportsThinking) || viewModel.useToolCalling || !viewModel.loraAdapters.isEmpty || hasModelSelected) ? 8 : 0)
...
var thinkingModeBadge: some View {
Button {
- thinkingModeEnabled.toggle()
+ settingsViewModel.thinkingModeEnabled.toggle()
} label: {
Also applies to: 502-518
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Views/ChatInterfaceView.swift`
around lines 34 - 35, Remove the duplicate `@AppStorage`("thinkingModeEnabled")
property in ChatInterfaceView and use the shared
Settings/ToolSettingsViewModel's thinkingModeEnabled instead; replace any uses
or bindings of the local thinkingModeEnabled with
toolSettingsViewModel.thinkingModeEnabled (including the badge binding and other
occurrences mentioned around the 502-518 region) so toggles update
SettingsViewModel.shared and avoid the second source of truth.
let exp = NSExpression(format: cleanedExpression)
if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
    return [
        "result": .number(result.doubleValue),
        "expression": .string(expression)
    ]
}
NSExpression(format:) can crash on malformed expressions.
The character validation (lines 112-120) prevents injection but doesn't validate syntax. Expressions like "1 + + 2", "1 /", or "((" pass character validation but will crash NSExpression(format:).
🛡️ Proposed fix: wrap NSExpression in a safe evaluation
- let exp = NSExpression(format: cleanedExpression)
- if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
- return [
- "result": .number(result.doubleValue),
- "expression": .string(expression)
- ]
+ do {
+ let exp = NSExpression(format: cleanedExpression)
+ if let result = exp.expressionValue(with: nil, context: nil) as? NSNumber {
+ return [
+ "result": .number(result.doubleValue),
+ "expression": .string(expression)
+ ]
+ }
+ } catch {
+ // NSExpression throws on invalid syntax
}
Note: NSExpression(format:) doesn't throw but can raise an Objective-C exception. Consider using NSExpression(format:argumentArray:) with proper validation or a dedicated math parser library for robustness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/ToolSettingsView.swift`
around lines 122 - 128, The NSExpression(format:) call can raise an Objective-C
exception for malformed syntax (e.g., "((", "1 + + 2"), so replace the direct
NSExpression(format: cleanedExpression) usage with a safe evaluator: implement
an ObjC bridging helper (e.g., evaluateExpressionSafely(_:)) that wraps
NSExpression(format:) and expressionValue(with:context:) inside @try/@catch and
returns nil on exception; then call that helper from ToolSettingsView (use
cleanedExpression and expression variables) and only build the return dictionary
when the helper returns a valid NSNumber, otherwise handle it as an invalid
expression (e.g., skip returning a result or surface an error).
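For illustration, a structural pre-check of the kind implied above can look like the following. This is a hedged sketch, not the app's code (the app is Swift; C++ is used here to keep this review's examples in one language), and it deliberately ignores unary minus and named functions.

```cpp
#include <cctype>
#include <string>

// Checks that an arithmetic expression is structurally well-formed, catching
// inputs like "1 + + 2", "1 /", or "((" that pass a character whitelist.
bool is_valid_arithmetic(const std::string& expr) {
    int depth = 0;
    bool expect_operand = true;  // expecting a number or '(' next
    size_t i = 0;
    while (i < expr.size()) {
        unsigned char c = static_cast<unsigned char>(expr[i]);
        if (std::isspace(c)) { i++; continue; }
        if (expect_operand) {
            if (c == '(') { depth++; i++; }
            else if (std::isdigit(c)) {
                // consume a full number literal (digits and optional '.')
                while (i < expr.size() &&
                       (std::isdigit(static_cast<unsigned char>(expr[i])) || expr[i] == '.'))
                    i++;
                expect_operand = false;
            } else return false;
        } else {
            if (c == ')') { if (--depth < 0) return false; i++; }
            else if (c == '+' || c == '-' || c == '*' || c == '/') {
                expect_operand = true; i++;
            } else return false;
        }
    }
    return depth == 0 && !expect_operand;  // no dangling operator or open paren
}
```

Rejecting such strings up front means the evaluator only ever sees expressions that are at least structurally sound, whatever exception-handling strategy is chosen.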
.onChange(of: scenePhase) { _, newPhase in
    if newPhase == .background || newPhase == .inactive {
        viewModel.stopAutoStreaming()
        viewModel.stopCamera()
    } else if newPhase == .active {
        setupCameraIfNeeded()
    }
}
Potential camera restart when view is not visible.
When the user navigates away, .onDisappear stops the camera. However, if the view remains in the navigation stack's hierarchy, .onChange(of: scenePhase) may still fire when the app becomes active, causing setupCameraIfNeeded() to restart the camera even though the user is on a different screen.
Consider tracking visibility state to guard against this:
🛠️ Proposed fix using visibility tracking
@State private var showingPhotos = false
@State private var selectedPhoto: PhotosPickerItem?
+@State private var isViewVisible = false
@Environment(\.scenePhase) private var scenePhase
-.onAppear { setupCameraIfNeeded() }
+.onAppear {
+ isViewVisible = true
+ setupCameraIfNeeded()
+}
.onDisappear {
+ isViewVisible = false
viewModel.stopAutoStreaming()
viewModel.stopCamera()
}
.onChange(of: scenePhase) { _, newPhase in
if newPhase == .background || newPhase == .inactive {
viewModel.stopAutoStreaming()
viewModel.stopCamera()
- } else if newPhase == .active {
+ } else if newPhase == .active && isViewVisible {
setupCameraIfNeeded()
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Vision/VLMCameraView.swift`
around lines 56 - 63, The scenePhase handler can restart the camera even when
this view is offscreen—add a visibility guard: introduce a Bool property (e.g.,
isVisible) updated in onAppear and onDisappear (or via SwiftUI .task/.onChange
for view visibility), and in the .onChange(of: scenePhase) closure check
isVisible before calling setupCameraIfNeeded(); keep the existing calls to
viewModel.stopAutoStreaming() and viewModel.stopCamera() when moving to
background/inactive, but avoid calling setupCameraIfNeeded() on .active unless
isVisible is true.
} else {
    // Scalar value (number, boolean, null)
    // Read until comma, closing brace, or whitespace
    size_t val_start = pos;
    size_t val_end = pos;
    while (val_end < len && json_obj[val_end] != ',' &&
           json_obj[val_end] != '}' && json_obj[val_end] != ']' &&
           json_obj[val_end] != '\n') {
        val_end++;
    }
    // Trim trailing whitespace
    while (val_end > val_start &&
           (json_obj[val_end - 1] == ' ' || json_obj[val_end - 1] == '\t')) {
        val_end--;
    }
    if (val_end > val_start) {
        size_t val_len = val_end - val_start;
        *out_value = static_cast<char*>(malloc(val_len + 1));
        if (*out_value) {
            memcpy(*out_value, json_obj + val_start, val_len);
            (*out_value)[val_len] = '\0';
        }
        *out_is_object = false;
        return true;
    }
Array-valued arguments are truncated in the new scalar path.
Line 401 sends [ into the scalar branch, but that scanner stops on the first , or ]. Inputs like {"items":[1,2]} get extracted as "[1", so any RAC_TOOL_PARAM_ARRAY argument is corrupted before parsing completes. Handle [ as structured JSON with a matching-bracket scan instead of treating it as a scalar literal.
As per coding guidelines, sdk/runanywhere-commons/**/*.{c,cpp,h,hpp}: "Always use structured types, never use strings directly for consistency and scalability"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `sdk/runanywhere-commons/src/features/llm/tool_calling.cpp` around lines 401 -
425, The scalar-value branch currently treats a leading '[' as a scalar and
stops at the first ',' or ']', truncating arrays; modify the scanner that
assigns *out_value and *out_is_object so that if json_obj[pos] == '[' you
perform a matching-bracket scan (similar to how '{' is handled) to find the
closing ']' including nested brackets and strings, allocate the full substring
(including brackets), set *out_is_object = true (or the same structured flag
used for objects), and return true; otherwise keep the existing scalar logic for
true scalars. Ensure you use the same allocation/memory conventions as the
existing object path and reuse variable names json_obj, pos, val_start/val_end,
*out_value and *out_is_object to locate the change.
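A minimal sketch of the matching-bracket scan the prompt calls for, shown with std::string rather than the file's C-style buffers for brevity (the function name scan_balanced is illustrative):

```cpp
#include <string>

// Given json[pos] == '[' or '{', returns the index one past the matching
// close bracket, counting nested same-type brackets and skipping brackets
// inside quoted strings; returns std::string::npos if unbalanced.
size_t scan_balanced(const std::string& json, size_t pos) {
    char open = json[pos];
    char close = (open == '[') ? ']' : '}';
    int depth = 0;
    bool in_string = false;
    for (size_t i = pos; i < json.size(); i++) {
        char c = json[i];
        if (in_string) {
            if (c == '\\') i++;              // skip the escaped character
            else if (c == '"') in_string = false;
        } else if (c == '"') in_string = true;
        else if (c == open) depth++;
        else if (c == close && --depth == 0) return i + 1;
    }
    return std::string::npos;
}
```

With this, the array branch can allocate the full substring (brackets included) instead of stopping at the first ',' or ']', so {"items":[1,2]} yields "[1,2]" rather than "[1".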
return LLMGenerationResult(
    text: generatedText,
-   thinkingContent: nil,
+   thinkingContent: thinkingContent,
    inputTokens: inputTokens,
    tokensUsed: outputTokens,
    modelUsed: modelId,
    latencyMs: totalTimeMs,
    framework: "llamacpp",
    tokensPerSecond: tokensPerSecond,
    timeToFirstTokenMs: nil,
-   thinkingTokens: 0,
+   thinkingTokens: thinkingContent.map { _ in outputTokens } ?? 0,
thinkingTokens is misreported and inconsistent across paths.
The non-streaming result uses the entire completion count as thinkingTokens whenever any <think> block exists, while the streaming result always returns 0. Because responseTokens is also set to outputTokens, this can double-count the same completion in analytics. Please return a consistent value in both paths until the C++ layer exposes a dedicated thinking-token metric.
Also applies to: 450-460
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift`
around lines 100 - 110, The LLMGenerationResult currently misreports
thinkingTokens (non-streaming sets it to outputTokens when any thinkingContent
exists, while streaming returns 0), causing inconsistent and double-counted
metrics; change both code paths that construct LLMGenerationResult (the
non-streaming return here and the similar block around the 450-460 range) to set
thinkingTokens to 0 (and ensure responseTokens/outputTokens remain unchanged)
until the C++ layer exposes a real thinking-token metric, so both streaming and
non-streaming results return a consistent zero-valued thinkingTokens.
enum ThinkingContentParser {
    /// Extracts `<think>...</think>` content from generated text.
    /// - Returns: Tuple of (responseText, thinkingContent). If no tags found, responseText = original text, thinkingContent = nil.
    static func extract(from text: String) -> (text: String, thinking: String?) {
        guard let startRange = text.range(of: "<think>"),
              let endRange = text.range(of: "</think>"),
              startRange.upperBound <= endRange.lowerBound else {
            return (text: text, thinking: nil)
        }
        let thinkingContent = String(text[startRange.upperBound..<endRange.lowerBound])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        // Include any text before <think> and after </think>
        let textBefore = String(text[..<startRange.lowerBound])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        let textAfter = String(text[endRange.upperBound...])
            .trimmingCharacters(in: .whitespacesAndNewlines)
        let responseText = [textBefore, textAfter]
            .filter { !$0.isEmpty }
            .joined(separator: "\n")
        return (
            text: responseText,
            thinking: thinkingContent.isEmpty ? nil : thinkingContent
        )
    }
The parser only removes the first <think> block.
extract(from:) stops after one closing tag, so additional or trailing incomplete think sections remain in text. The app-side LLMViewModel.stripThinkTags already handles repeated/incomplete blocks, which means SDK results can still surface raw think markup in non-streaming and final streaming results. Please make this parser consume all think blocks or centralize one shared implementation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/LLM/RunAnywhere+TextGeneration.swift`
around lines 312 - 335, The current ThinkingContentParser.extract(from:) only
removes the first <think>...</think> pair; update extract to scan the entire
input (e.g., with a while loop or regex) to find and remove all complete
<think>...</think> blocks, concatenating their inner contents (trimmed) into a
single thinking string, and rebuild response text from the remaining parts; also
strip any stray "<think>" or "</think>" leftovers after extraction so incomplete
fragments do not appear in the returned text. Ensure you modify
ThinkingContentParser.extract to collect all matches, remove them from the
source, and return (text: cleanedResponseText, thinking: combinedThinkingOrNil).
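The requested multi-block behavior can be sketched as follows. This is an illustrative stand-in in portable C++ (the SDK parser itself is Swift), simplified in that it joins blocks with newlines, skips whitespace trimming, and only strips stray tags rather than incomplete fragment content:

```cpp
#include <string>
#include <utility>

// Removes every complete <think>...</think> block from text, returning
// {cleanedText, combinedThinking}; stray tags from incomplete blocks are
// stripped from the cleaned text.
std::pair<std::string, std::string> extract_thinking(std::string text) {
    const std::string open = "<think>", close = "</think>";
    std::string thinking;
    size_t start;
    while ((start = text.find(open)) != std::string::npos) {
        size_t end = text.find(close, start + open.size());
        if (end == std::string::npos) break;  // incomplete block: handled below
        if (!thinking.empty()) thinking += "\n";
        thinking += text.substr(start + open.size(), end - start - open.size());
        text.erase(start, end + close.size() - start);
    }
    // Strip any stray tags left behind by incomplete fragments.
    for (const std::string& tag : {open, close}) {
        size_t p;
        while ((p = text.find(tag)) != std::string::npos) text.erase(p, tag.size());
    }
    return {text, thinking};
}
```

Centralizing one implementation with this shape would keep the SDK parser and the app-side stripThinkTags from diverging.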
let finalContext = Unmanaged<TTSStreamContext>.fromOpaque(contextPtr).takeRetainedValue()
let totalAudioData = finalContext.totalData
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the native TTS streaming implementation for callback lifetime/threading.
fd 'rac_tts_component' -HI | while read -r file; do
rg -n -C5 'rac_tts_component_synthesize_stream|user_data|callback|dispatch_async|std::thread|async|queue|worker' "$file"
done
# Compare Swift-side retain/release patterns used by other streaming bridges.
rg -n -C4 'passRetained|takeRetainedValue|takeUnretainedValue' \
  sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 21787
🏁 Script executed:
# Check if TTSStreamContext and STTStreamingContext have similar definitions and usage
rg -n -A 10 'class TTSStreamContext|struct TTSStreamContext' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/
# Check STT context definition for comparison
rg -n -A 10 'class STTStreamingContext|struct STTStreamingContext' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/
# Check if there are any comments in TTS about synchronous behavior
rg -n -B 5 -A 5 'rac_tts_component_synthesize_stream' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift
# Search for any TODOs or FIXMEs related to threading/concurrency in TTS
rg -n 'TODO|FIXME|SAFETY|thread|async|race' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 3399
🏁 Script executed:
# Check how STT uses finalText in callbacks
rg -n -B 2 -A 2 'finalText' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift | head -40
# Check if there are any comments explaining the synchronization choice in STT
rg -n -B 5 -A 0 'OSAllocatedUnfairLock' sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1068
Protect totalData with a Swift 6 synchronization primitive; mirror the STT implementation.
Line 311 declares TTSStreamContext as @unchecked Sendable with an unguarded mutable totalData, which is mutated directly in the C callback at line 183. The sibling STT bridge (sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/STT/RunAnywhere+STT.swift:306-313) solves this correctly using OSAllocatedUnfairLock to protect its mutable state. Replace the @unchecked Sendable with a proper Sendable implementation using OSAllocatedUnfairLock for totalData to guard against concurrent callback access, consistent with Swift 6 APIs and the coding guidelines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/TTS/RunAnywhere+TTS.swift`
around lines 189 - 190, TTSStreamContext currently marked `@unchecked` Sendable
with an unprotected mutable totalData that is mutated from the C callback;
change TTSStreamContext to implement Sendable safely by adding an
OSAllocatedUnfairLock (matching the STT bridge pattern) and use it to guard all
accesses/mutations of totalData (wrap mutations in the C callback and the read
in the finalization path where
Unmanaged<TTSStreamContext>.fromOpaque(...).takeRetainedValue() is used). Mirror
the STT implementation (use the same lock property name and lock/unlock
placement) so totalData is only read/written while holding the
OSAllocatedUnfairLock.
let effectivePrompt: String
if !config.thinkingModeEnabled {
    effectivePrompt = "/no_think\n\(transcription)"
} else {
    effectivePrompt = transcription
}
Inconsistent thinking mode logic compared to other files.
This code applies /no_think prefix based solely on config.thinkingModeEnabled, but RAGViewModel (line 112) checks both loadedModelSupportsThinking && !thinkingModeEnabled.
If a model doesn't support thinking at all, the prefix is unnecessary and might confuse it. Consider aligning with the pattern used elsewhere:
🔧 Suggested alignment
The VoiceSessionConfig would need to also carry whether the model supports thinking, or the check should happen at the config construction site (which VoiceAgentViewModel already does at line 392).
If the config is always constructed with proper logic (only setting thinkingModeEnabled=true when the model supports it), the current code is fine. The construction in VoiceAgentViewModel line 392 does handle this correctly:
thinkingModeEnabled: settings.loadedModelSupportsThinking && settings.thinkingModeEnabled
So this is safe, but the semantics are subtly different across files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift`
around lines 224 - 229, The logic for adding the "/no_think" prefix in
RunAnywhere+VoiceSession.swift currently uses only config.thinkingModeEnabled;
align it with the pattern in RAGViewModel by ensuring the decision also
considers whether the model supports thinking (i.e., the same combined condition
used when building the config elsewhere). Update the construction or usage so
that effectivePrompt is prefixed with "/no_think" only when thinkingMode is
disabled AND the model supports thinking was considered when the config was
created (mirror the check used in RAGViewModel and VoiceAgentViewModel where
thinkingModeEnabled is set as settings.loadedModelSupportsThinking &&
settings.thinkingModeEnabled).
```swift
emit(.turnCompleted(
    transcript: transcription,
    response: cleanedResponse,
    thinkingContent: thinkingContent,
    audio: synthesizedAudio
))
```
turnCompleted emitted even after errors with potentially empty/stale values.
If an error occurs in the do block (lines 208-256), the code falls through to emit turnCompleted (lines 258-263) with whatever values transcription, cleanedResponse, thinkingContent, and synthesizedAudio held at the time of failure. This could emit partial results (e.g., transcription succeeded but generation failed).
Consider whether turnCompleted should be skipped on errors, or if partial results are intentionally desired.
🔧 Potential fix if partial results are not desired
```diff
 } catch {
     logger.error("Processing failed: \(error)")
     emit(.error(error.localizedDescription))
+    // Skip turnCompleted on error - error event is sufficient
+    if config.continuousMode && isRunning {
+        try? await startListening()
+    }
+    return
 }
 emit(.turnCompleted(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 258 - 263, The snippet currently emits .turnCompleted with
variables transcription, cleanedResponse, thinkingContent, and synthesizedAudio
even when an error occurs in the preceding do block; change the control flow so
.turnCompleted is only emitted when the do block completes successfully (i.e.,
move the emit(.turnCompleted(...)) into the success path of the do block or
guard for a successful result before emitting), and ensure the catch path either
emits an error-specific event (or returns/throws) instead of fallthrough
emission; reference the existing symbols transcription, cleanedResponse,
thinkingContent, synthesizedAudio and the emit(.turnCompleted(...)) call when
making the change.
🧹 Nitpick comments (1)
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift (1)
67-67: Consider adding a comment explaining the early initialization.

The `_ = SettingsViewModel.shared` forces the singleton to initialize before `initializeSDK()` runs. This ensures the `loadSettings()`, `setupObservers()`, and `subscribeToModelNotifications()` setup completes before views access the shared instance. The pattern is valid, but a brief comment would clarify intent for future maintainers.

📝 Suggested clarification

```diff
 .task {
+    // Ensure SettingsViewModel singleton is initialized before SDK setup
+    // (subscriptions and settings must be ready before views access it)
     _ = SettingsViewModel.shared
     logger.info("🏁 App launched, initializing SDK...")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift` at line 67, Add a brief inline comment above the `_ = SettingsViewModel.shared` line explaining that this forces early initialization of the SettingsViewModel singleton so loadSettings(), setupObservers(), and subscribeToModelNotifications() run before initializeSDK() and any views access the shared instance; reference the SettingsViewModel.shared singleton and initializeSDK() in the comment to make the intent clear to future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift`:
- Line 67: Add a brief inline comment above the `_ = SettingsViewModel.shared`
line explaining that this forces early initialization of the SettingsViewModel
singleton so loadSettings(), setupObservers(), and
subscribeToModelNotifications() run before initializeSDK() and any views access
the shared instance; reference the SettingsViewModel.shared singleton and
initializeSDK() in the comment to make the intent clear to future maintainers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5405eb26-148a-41f9-b588-910fe7bbf60d
📒 Files selected for processing (1)
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift (1)
30-33: Consider the implications of sensitive demo prompts.

The example prompts for the abliterated adapter include potentially sensitive content (lock-picking instructions). While this demonstrates the adapter's uncensored capabilities, consider whether these specific prompts are appropriate for a public demo application that may be used in presentations or shared screenshots (as seen in this PR's description).
A safer alternative might be prompts that test reduced restrictions without encouraging potentially harmful outputs:
- "Discuss the philosophical arguments for and against free speech"
- "What are some morally complex dilemmas in medical ethics?"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift` around lines 30 - 33, Replace the sensitive demo prompt in LoraExamplePrompts.swift for the adapter key "qwen2.5-0.5b-abliterated-lora-f16.gguf": remove or replace the "Explain how lock picking works in detail" example with a non-harmful test prompt that still exercises reduced restrictions (e.g., "Discuss the philosophical arguments for and against free speech" or "What are some morally complex dilemmas in medical ethics?"); update the array for the "qwen2.5-0.5b-abliterated-lora-f16.gguf" entry to use the safer prompt(s) so public demos/screenshots do not display potentially dangerous instructions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift`:
- Around line 71-81: The LoraAdapterCatalogEntry for id "abliterated-lora"
currently sets fileSize to 0 which causes incorrect download progress and
storage calculations; update the fileSize property on that
LoraAdapterCatalogEntry (the instance with id "abliterated-lora" and filename
"qwen2.5-0.5b-abliterated-lora-f16.gguf") to the actual byte size 17_620_224 so
UI progress and storage estimates use the correct value.
---
Nitpick comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift`:
- Around line 30-33: Replace the sensitive demo prompt in
LoraExamplePrompts.swift for the adapter key
"qwen2.5-0.5b-abliterated-lora-f16.gguf": remove or replace the "Explain how
lock picking works in detail" example with a non-harmful test prompt that still
exercises reduced restrictions (e.g., "Discuss the philosophical arguments for
and against free speech" or "What are some morally complex dilemmas in medical
ethics?"); update the array for the "qwen2.5-0.5b-abliterated-lora-f16.gguf"
entry to use the safer prompt(s) so public demos/screenshots do not display
potentially dangerous instructions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1474e7b1-f815-43c6-b174-58a3a0956914
📒 Files selected for processing (3)
- examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/LoraExamplePrompts.swift
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift
```swift
// --- Abliterated adapter (uncensored fine-tune for Qwen 2.5 0.5B base) ---
LoraAdapterCatalogEntry(
    id: "abliterated-lora",
    name: "Abliterated (Uncensored)",
    description: "Removes content restrictions for unrestricted responses",
    downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
    filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
    compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
    fileSize: 0,
    defaultScale: 1.0
),
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Description: Fetch the actual file size of the abliterated adapter from HuggingFace
# Expected: Returns the Content-Length header showing the file size in bytes
curl -sI "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf" | grep -i content-length
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 94
🏁 Script executed:
```shell
# First, let's check the actual current state of the file
cat -n examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift | sed -n '70,85p'
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 805
🏁 Script executed:
```shell
# The curl command may be returning redirect HTML. Let's try a different approach
# to get the actual binary file size using HuggingFace's API or checking headers differently
curl -sL -o /dev/null -w "Content-Length: %{size_download} bytes\n" "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf"
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 104
Update fileSize to match actual HuggingFace file size.
The fileSize: 0 should be updated to 17_620_224 (the actual size in bytes). A zero file size can cause indeterminate progress indicators during download and inaccurate storage requirement estimates shown to users.
Proposed fix
```diff
 LoraAdapterCatalogEntry(
     id: "abliterated-lora",
     name: "Abliterated (Uncensored)",
     description: "Removes content restrictions for unrestricted responses",
     downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
     filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
     compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
-    fileSize: 0,
+    fileSize: 17_620_224,
     defaultScale: 1.0
 ),
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```swift
// --- Abliterated adapter (uncensored fine-tune for Qwen 2.5 0.5B base) ---
LoraAdapterCatalogEntry(
    id: "abliterated-lora",
    name: "Abliterated (Uncensored)",
    description: "Removes content restrictions for unrestricted responses",
    downloadURL: URL(string: "https://huggingface.co/Void2377/qwen-lora-gguf/resolve/main/qwen2.5-0.5b-abliterated-lora-f16.gguf")!,
    filename: "qwen2.5-0.5b-abliterated-lora-f16.gguf",
    compatibleModelIds: ["qwen2.5-0.5b-base-q8_0"],
    fileSize: 17_620_224,
    defaultScale: 1.0
),
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Chat/Models/DemoLoRAAdapter.swift`
around lines 71 - 81, The LoraAdapterCatalogEntry for id "abliterated-lora"
currently sets fileSize to 0 which causes incorrect download progress and
storage calculations; update the fileSize property on that
LoraAdapterCatalogEntry (the instance with id "abliterated-lora" and filename
"qwen2.5-0.5b-abliterated-lora-f16.gguf") to the actual byte size 17_620_224 so
UI progress and storage estimates use the correct value.
Actionable comments posted: 3
♻️ Duplicate comments (1)
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift (1)
259-269: ⚠️ Potential issue | 🟠 Major

Do not emit `turnCompleted` from the error path.

The `catch` still falls through to `emit(.turnCompleted(...))`, so consumers receive a success-shaped event with partial or stale fields after STT/LLM/TTS fails. That is especially misleading because examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift treats `.turnCompleted` as the signal to go back to the ready state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift around lines 259 - 269, The current catch block logs and emits .error but execution still falls through to emit(.turnCompleted(...)), causing a success-shaped event after failures; update the control flow in RunAnywhere+VoiceSession (the catch handling around the try that produces transcription/cleanedResponse/thinkingContent/synthesizedAudio) so that when you call emit(.error(...)) you do not subsequently call emit(.turnCompleted(...)) — either move emit(.turnCompleted(...)) into the happy-path (inside the try after successful STT/LLM/TTS) or ensure the catch returns/throws/early-exits immediately after logger.error and emit(.error(...)) so .turnCompleted is never emitted on error.
🧹 Nitpick comments (2)
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift (2)
617-642: Minor inconsistency in summary length logic.

When there are 2+ sentences and the first sentence is > 20 chars, the entire first sentence is returned (could be 100+ chars). However, single-sentence text is truncated at 80 chars. This asymmetry may produce unexpectedly long summaries for multi-sentence content with a long first sentence.
Consider capping the first-sentence branch as well, or unifying the truncation approach:
♻️ Optional: Cap first sentence length
```diff
 if sentences.count >= 2 {
     let firstSentence = sentences[0].trimmingCharacters(in: .whitespacesAndNewlines)
-    if firstSentence.count > 20 {
-        return firstSentence + "..."
+    if firstSentence.count > 20 && firstSentence.count <= 80 {
+        return firstSentence + "..."
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift` around lines 617 - 642, The thinkingSummary logic can return an arbitrarily long first sentence when there are 2+ sentences (via message.thinkingContent -> sentences -> firstSentence) but trims single-sentence text to 80 chars; unify this by capping the firstSentence branch to the same 80-char truncation rules: when sentences.count >= 2 compute firstSentence, then if firstSentence.count > 80 apply the existing truncation (take prefix(80), cut back to last space if present, append "…") else return firstSentence; keep the existing 80-char truncation logic for the full-thinking branch unchanged.
513-515: Consider simplifying the `hasThinking` computed property.

The nil check is redundant since optional chaining already handles it.
♻️ Simplified version
```diff
 private var hasThinking: Bool {
-    message.thinkingContent != nil && !(message.thinkingContent?.isEmpty ?? true)
+    message.thinkingContent?.isEmpty == false
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift` around lines 513 - 515, The computed property hasThinking is doing an unnecessary explicit nil check; simplify it by relying on optional chaining and a single boolean expression using message.thinkingContent (e.g., check message.thinkingContent?.isEmpty == false or use !(message.thinkingContent?.isEmpty ?? true)). Update the hasThinking getter to remove the redundant "!= nil" and only evaluate the optional's isEmpty result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`:
- Around line 97-114: In SettingsViewModel.init (and before
subscribeToModelNotifications is relied on), synchronously query the SDK for the
current active model/capability and set loadedModelSupportsThinking accordingly
so the flag reflects the present state; e.g. add a call in init that asks
RunAnywhere (or the SDK model API) for the currently loaded model/capability,
update loadedModelSupportsThinking on the main actor, then continue to call
subscribeToModelNotifications() to handle future events via handleSDKEvent;
ensure this check uses the same property/logic as handleSDKEvent so
VoiceAgentViewModel.startConversation sees the correct value even if a model was
loaded before SettingsViewModel was created.
In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift`:
- Around line 511-519: The current micContent uses
.onTapGesture/.onLongPressGesture on a plain view which strips Button
accessibility; instead wrap micContent in a Button (use the existing action as
the Button's primary action) and add the long-press behavior via
.simultaneousGesture(LongPressGesture(minimumDuration: 0.5).onEnded { _ in
(onLongPress ?? action)() }) or use .contextMenu if the long press shows
options, and also add an accessibilityAction for the long-press so
VoiceOver/Full Keyboard Access can invoke it; update the code paths that
reference micContent, onLongPress, and action accordingly.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift:
- Around line 123-125: resumeListening currently calls startListening() with
try? which both hides errors and causes startAudioLevelMonitoring (launched by
startListening) to be started repeatedly; update resumeListening (and related
state) to first cancel any existing audio-level monitoring task (e.g., cancel
audioLevelMonitoringTask or equivalent) or guard against duplicate monitoring,
then call await startListening() without try? so errors are propagated to the
caller; alternatively adjust startListening/startAudioLevelMonitoring to be
idempotent (check a monitoringTask flag) so repeated calls don't spawn
additional monitoring loops—use the concrete symbols resumeListening,
startListening, startAudioLevelMonitoring, isRunning and whichever variable
holds the monitoring Task to implement cancellation/guarding and remove the try?
suppression.
---
Duplicate comments:
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift:
- Around line 259-269: The current catch block logs and emits .error but
execution still falls through to emit(.turnCompleted(...)), causing a
success-shaped event after failures; update the control flow in
RunAnywhere+VoiceSession (the catch handling around the try that produces
transcription/cleanedResponse/thinkingContent/synthesizedAudio) so that when you
call emit(.error(...)) you do not subsequently call emit(.turnCompleted(...)) —
either move emit(.turnCompleted(...)) into the happy-path (inside the try after
successful STT/LLM/TTS) or ensure the catch returns/throws/early-exits
immediately after logger.error and emit(.error(...)) so .turnCompleted is never
emitted on error.
---
Nitpick comments:
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift`:
- Around line 617-642: The thinkingSummary logic can return an arbitrarily long
first sentence when there are 2+ sentences (via message.thinkingContent ->
sentences -> firstSentence) but trims single-sentence text to 80 chars; unify
this by capping the firstSentence branch to the same 80-char truncation rules:
when sentences.count >= 2 compute firstSentence, then if firstSentence.count >
80 apply the existing truncation (take prefix(80), cut back to last space if
present, append "…") else return firstSentence; keep the existing 80-char
truncation logic for the full-thinking branch unchanged.
- Around line 513-515: The computed property hasThinking is doing an unnecessary
explicit nil check; simplify it by relying on optional chaining and a single
boolean expression using message.thinkingContent (e.g., check
message.thinkingContent?.isEmpty == false or use
!(message.thinkingContent?.isEmpty ?? true)). Update the hasThinking getter to
remove the redundant "!= nil" and only evaluate the optional's isEmpty result.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d42810eb-8808-4ba3-b051-bcfa7dca7db1
📒 Files selected for processing (7)
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/ViewModels/RAGViewModel.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/RAG/Views/DocumentRAGView.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAgentViewModel.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift
- examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift
- sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere+VoiceSession.swift
```swift
init() {
    loadSettings()
    setupObservers()
    subscribeToModelNotifications()
}

private func subscribeToModelNotifications() {
    // Subscribe to SDK events directly so any LLM model load
    // (from chat, voice agent, or RAG) updates the thinking mode flag.
    RunAnywhere.events.events
        .receive(on: DispatchQueue.main)
        .sink { [weak self] event in
            Task { @MainActor in
                self?.handleSDKEvent(event)
            }
        }
        .store(in: &cancellables)
}
```
Sync the current LLM capability before relying on notifications.
This only subscribes to future load/unload events. If SettingsViewModel.shared is first instantiated after an LLM is already loaded, loadedModelSupportsThinking stays false, and VoiceAgentViewModel.startConversation() then disables thinking mode for that session even when the active model supports it. Prime this flag from the current SDK/model state during initialization, then use notifications for updates.
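One possible shape for the priming, sketched below. `RunAnywhere.currentLLMModel` and `supportsThinking` are hypothetical names; use whatever accessor the SDK actually exposes for the active model's metadata, mirroring the logic already in `handleSDKEvent`.

```swift
init() {
    loadSettings()
    setupObservers()
    // Prime the flag from the current SDK state before relying on events,
    // in case a model was already loaded when this singleton is created.
    // RunAnywhere.currentLLMModel is a hypothetical accessor.
    if let model = RunAnywhere.currentLLMModel {
        loadedModelSupportsThinking = model.supportsThinking
    }
    subscribeToModelNotifications()
}
```

With this, `VoiceAgentViewModel.startConversation()` sees a correct value regardless of whether the model loaded before or after `SettingsViewModel` was created.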
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Settings/SettingsViewModel.swift`
around lines 97 - 114, In SettingsViewModel.init (and before
subscribeToModelNotifications is relied on), synchronously query the SDK for the
current active model/capability and set loadedModelSupportsThinking accordingly
so the flag reflects the present state; e.g. add a call in init that asks
RunAnywhere (or the SDK model API) for the currently loaded model/capability,
update loadedModelSupportsThinking on the main actor, then continue to call
subscribeToModelNotifications() to handle future events via handleSDKEvent;
ensure this check uses the same property/logic as handleSDKEvent so
VoiceAgentViewModel.startConversation sees the correct value even if a model was
loaded before SettingsViewModel was created.
```swift
if #available(iOS 26.0, macOS 26.0, *) {
    Button(action: action) {
        ZStack {
            // Background circle
            Circle()
                .fill(isActive ? activeColor : inactiveColor)
                .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)

            // Pulsing effect when active
            if isPulsing {
                Circle()
                    .stroke(Color.white.opacity(0.4), lineWidth: 2)
                    .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)
                    .scaleEffect(1.3)
                    .opacity(0)
                    .animation(
                        .easeOut(duration: 1.0).repeatForever(autoreverses: false),
                        value: isPulsing
                    )
            }

            // Icon or loading indicator
            if isLoading {
                ProgressView()
                    .progressViewStyle(CircularProgressViewStyle(tint: .white))
                    .scaleEffect(1.2)
            } else {
                Image(systemName: icon)
                    .font(.system(size: AdaptiveSizing.micIconSize))
                    .foregroundColor(.white)
                    .contentTransition(.symbolEffect(.replace))
                    .animation(.smooth(duration: 0.3), value: icon)
            }
        }
    }
    .buttonStyle(.plain)
    .glassEffect(.regular.interactive())
    micContent
        .onLongPressGesture(minimumDuration: 0.5, perform: { onLongPress?() ?? action() })
        .onTapGesture(perform: action)
        .glassEffect(.regular.interactive())
} else {
    Button(action: action) {
        ZStack {
            // Background circle
            Circle()
                .fill(isActive ? activeColor : inactiveColor)
                .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)

            // Pulsing effect when active
            if isPulsing {
                Circle()
                    .stroke(Color.white.opacity(0.4), lineWidth: 2)
                    .frame(width: AdaptiveSizing.micButtonSize, height: AdaptiveSizing.micButtonSize)
                    .scaleEffect(1.3)
                    .opacity(0)
                    .animation(
                        .easeOut(duration: 1.0).repeatForever(autoreverses: false),
                        value: isPulsing
                    )
            }

            // Icon or loading indicator
            if isLoading {
                ProgressView()
                    .progressViewStyle(CircularProgressViewStyle(tint: .white))
                    .scaleEffect(1.2)
            } else {
                Image(systemName: icon)
                    .font(.system(size: AdaptiveSizing.micIconSize))
                    .foregroundColor(.white)
                    .contentTransition(.symbolEffect(.replace))
                    .animation(.smooth(duration: 0.3), value: icon)
            }
        }
    }
    .buttonStyle(.plain)
    micContent
        .onLongPressGesture(minimumDuration: 0.5, perform: { onLongPress?() ?? action() })
        .onTapGesture(perform: action)
```
🧩 Analysis chain
🌐 Web query:
Does a SwiftUI view using .onTapGesture / .onLongPressGesture provide the same accessibility button traits and keyboard activation as Button, and what is the recommended way to add a long-press action while preserving Button semantics?
💡 Result:
No.
- A view that you make tappable with `.onTapGesture`/`.onLongPressGesture` is still just "a view with gestures"; it doesn't automatically become an accessible control with Button semantics (button trait, default "activate" action, keyboard/focus activation behavior, etc.). This is commonly called out as something you "lose" vs. using `Button`. [2] Some tooling even flags `.onTapGesture` as an accessibility issue unless you explicitly add `.isButton`/`.isLink`. [3]
- By contrast, SwiftUI `Button` is treated as a button by accessibility by default (it carries the button trait, which is why Apple advises you not to put "button" in the accessibility label). [1], [4]

Recommended way to add a long-press action and keep Button semantics

Keep the primary interaction as a real Button, then layer a long-press interaction on top:

Option A (best when long-press is "more actions"): use contextMenu on the Button
- iOS's standard long-press pattern is a context menu, and Apple explicitly recommends that if you "overload a long press" / provide a context menu, you should also expose equivalent actions via accessibility (e.g., VoiceOver actions rotor / keyboard command). [5]

Option B (when long-press is a distinct secondary action): attach a long-press gesture to the Button
- e.g. `Button { primary() } label: { … } .simultaneousGesture(LongPressGesture().onEnded { _ in secondary() })` (a common SwiftUI approach). [6]
- If that secondary action matters for accessibility, also expose it as an accessibility action (so it's discoverable/triggerable without performing a long-press). [5]
🏁 Script executed:

```shell
cd examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers && wc -l AdaptiveLayout.swift
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1911

🏁 Script executed:

```shell
cd examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers && sed -n '490,530p' AdaptiveLayout.swift
```

Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1911
Use Button with layered long-press gesture to preserve accessibility.
Replacing a Button with .onTapGesture / .onLongPressGesture on a plain view removes the automatic button accessibility traits, keyboard activation, and VoiceOver support. This makes the primary mic control inaccessible to users with VoiceOver and Full Keyboard Access.
Keep Button as the base and add long-press behavior on top using one of these approaches:
- Use `contextMenu` on the Button if long-press shows additional options (iOS standard pattern)
- Use `.simultaneousGesture(LongPressGesture())` on the Button if long-press is a distinct secondary action
If the long-press action is important, also expose it as an accessibility action so it's discoverable and triggerable without requiring a gesture.
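A minimal sketch of the recommended shape, reusing the existing `micContent`, `action`, and `onLongPress` symbols from AdaptiveLayout.swift (the accessibility action name "Hold to talk" is illustrative):

```swift
// Keep Button as the base so the control retains the button accessibility
// trait and keyboard activation; layer the long-press on top.
Button(action: action) {
    micContent
}
.buttonStyle(.plain)
.simultaneousGesture(
    LongPressGesture(minimumDuration: 0.5)
        .onEnded { _ in (onLongPress ?? action)() }
)
// Expose the long-press as a named accessibility action so VoiceOver and
// Full Keyboard Access users can trigger it without performing the gesture.
.accessibilityAction(named: "Hold to talk") {
    (onLongPress ?? action)()
}
```

The same pattern applies to both the iOS 26 branch (with `.glassEffect`) and the fallback branch.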
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/ios/RunAnywhereAI/RunAnywhereAI/Helpers/AdaptiveLayout.swift` around
lines 511 - 519, The current micContent uses .onTapGesture/.onLongPressGesture
on a plain view which strips Button accessibility; instead wrap micContent in a
Button (use the existing action as the Button's primary action) and add the
long-press behavior via .simultaneousGesture(LongPressGesture(minimumDuration:
0.5).onEnded { _ in (onLongPress ?? action)() }) or use .contextMenu if the long
press shows options, and also add an accessibilityAction for the long-press so
VoiceOver/Full Keyboard Access can invoke it; update the code paths that
reference micContent, onLongPress, and action accordingly.
```swift
public func resumeListening() async {
    guard isRunning else { return }
    try? await startListening()
```
resumeListening() should not silently stack another monitor loop.
startListening() always launches startAudioLevelMonitoring() again (Lines 146-160), and none of those tasks are cancelled while isRunning stays true. In push-to-talk mode, every resume adds another .listening loop; try? also hides a failed restart from the caller. Track/cancel the monitoring task before restarting and surface failures from this API.
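One possible shape, sketched under two assumptions: a stored `audioLevelMonitoringTask` property holds the Task started by `startAudioLevelMonitoring()` (the actual property name in the extension may differ), and surfacing errors means changing the signature to `async throws`:

```swift
public func resumeListening() async throws {
    guard isRunning else { return }
    // Cancel any previous monitoring loop before starting a new one,
    // so push-to-talk resumes don't stack duplicate .listening loops.
    audioLevelMonitoringTask?.cancel()
    audioLevelMonitoringTask = nil
    // Propagate failures instead of swallowing them with try?.
    try await startListening()
}
```

Alternatively, `startAudioLevelMonitoring()` could guard on its own task handle so repeated calls are idempotent, leaving `resumeListening()` unchanged apart from the error propagation.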
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VoiceAgent/RunAnywhere`+VoiceSession.swift
around lines 123 - 125, resumeListening currently calls startListening() with
try? which both hides errors and causes startAudioLevelMonitoring (launched by
startListening) to be started repeatedly; update resumeListening (and related
state) to first cancel any existing audio-level monitoring task (e.g., cancel
audioLevelMonitoringTask or equivalent) or guard against duplicate monitoring,
then call await startListening() without try? so errors are propagated to the
caller; alternatively adjust startListening/startAudioLevelMonitoring to be
idempotent (check a monitoringTask flag) so repeated calls don't spawn
additional monitoring loops—use the concrete symbols resumeListening,
startListening, startAudioLevelMonitoring, isRunning and whichever variable
holds the monitoring Task to implement cancellation/guarding and remove the try?
suppression.
will post ss of testing in a bit







Summary by CodeRabbit
New Features
Improvements
Bug Fixes